Overview

Dataset statistics

Number of variables: 14
Number of observations: 260682
Missing cells: 39793
Missing cells (%): 1.1%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 27.8 MiB
Average record size in memory: 112.0 B
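
The derived figures in this table follow from the raw counts. The sketch below recomputes them; the 8-bytes-per-value storage width is an assumption chosen because it reproduces the reported 112 B record size, not something the report states.

```python
# Figures copied from the "Dataset statistics" table.
n_variables = 14
n_observations = 260_682
missing_cells = 39_793

# Share of missing cells over the whole variables x observations grid.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

# Assumption: each value occupies 8 bytes (int64, float64, datetime64
# or an object pointer on a 64-bit platform).
bytes_per_value = 8
record_size = n_variables * bytes_per_value
total_mib = record_size * n_observations / 2**20

print(f"missing cells: {missing_pct:.1f}%")
print(f"record size: {record_size} B, total: {total_mib:.1f} MiB")
```

Under that assumption the memory totals agree with the table after rounding.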

Variable types

Categorical: 4
DateTime: 1
Numeric: 9

Warnings

VERSIE has constant value "1.0" (Constant)
DATUM_BESTAND has constant value "2021-02-09" (Constant)
PEILDATUM has constant value "2021-02-01" (Constant)
TYPERENDE_DIAGNOSE_CD has a high cardinality: 1766 distinct values (High cardinality)
AANTAL_PAT_PER_ZPD is highly correlated with AANTAL_SUBTRAJECT_PER_ZPD (High correlation)
AANTAL_SUBTRAJECT_PER_ZPD is highly correlated with AANTAL_PAT_PER_ZPD (High correlation)
AANTAL_PAT_PER_DIAG is highly correlated with AANTAL_SUBTRAJECT_PER_DIAG (High correlation)
AANTAL_SUBTRAJECT_PER_DIAG is highly correlated with AANTAL_PAT_PER_DIAG (High correlation)
AANTAL_PAT_PER_SPC is highly correlated with AANTAL_SUBTRAJECT_PER_SPC (High correlation)
AANTAL_SUBTRAJECT_PER_SPC is highly correlated with AANTAL_PAT_PER_SPC (High correlation)
PEILDATUM is highly correlated with DATUM_BESTAND and 1 other field (High correlation)
DATUM_BESTAND is highly correlated with PEILDATUM and 1 other field (High correlation)
VERSIE is highly correlated with PEILDATUM and 1 other field (High correlation)
GEMIDDELDE_VERKOOPPRIJS has 39793 (15.3%) missing values (Missing)
AANTAL_SUBTRAJECT_PER_ZPD is highly skewed (γ1 = 21.01127808) (Skewed)

Reproduction

Analysis started: 2021-02-23 10:28:29.752753
Analysis finished: 2021-02-23 10:29:01.335022
Duration: 31.58 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

VERSIE
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
1.0: 260682

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 782046
Distinct characters: 3
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
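
As an illustration of such character properties, the sketch below counts Unicode general categories with Python's standard `unicodedata` module. The helper name `category_counts` is hypothetical, and this is not necessarily how the report computes its figures; scripts and blocks are not exposed by the standard library and are left out.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories over all characters in `values`."""
    counts = Counter()
    for value in values:
        for ch in value:
            # 'Nd' = Decimal Number, 'Po' = Other Punctuation,
            # 'Pd' = Dash Punctuation, 'Lu' = Uppercase Letter.
            counts[unicodedata.category(ch)] += 1
    return counts


# Five copies of the constant VERSIE value "1.0".
counts = category_counts(["1.0"] * 5)
print(counts)
```

Each "1.0" contributes two decimal digits and one punctuation mark, which matches the 66.7% / 33.3% split reported for this variable.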

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1.0
2nd row: 1.0
3rd row: 1.0
4th row: 1.0
5th row: 1.0

Common values

Value  Count   Frequency (%)
1.0    260682  100.0%

[Figure: histogram of lengths of the category]

Most occurring characters

Value  Count   Frequency (%)
1      260682  33.3%
.      260682  33.3%
0      260682  33.3%

Most occurring categories

Value              Count   Frequency (%)
Decimal Number     521364  66.7%
Other Punctuation  260682  33.3%

Most frequent character per category

Decimal Number:

Value  Count   Frequency (%)
1      260682  50.0%
0      260682  50.0%

Other Punctuation:

Value  Count   Frequency (%)
.      260682  100.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  782046  100.0%

Most frequent character per script

Value  Count   Frequency (%)
1      260682  33.3%
.      260682  33.3%
0      260682  33.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  782046  100.0%

Most frequent character per block

Value  Count   Frequency (%)
1      260682  33.3%
.      260682  33.3%
0      260682  33.3%

DATUM_BESTAND
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
2021-02-09: 260682

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2606820
Distinct characters: 5
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-02-09
2nd row: 2021-02-09
3rd row: 2021-02-09
4th row: 2021-02-09
5th row: 2021-02-09

Common values

Value       Count   Frequency (%)
2021-02-09  260682  100.0%

[Figure: histogram of lengths of the category]

Most occurring characters

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
-      521364  20.0%
1      260682  10.0%
9      260682  10.0%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    2085456  80.0%
Dash Punctuation  521364   20.0%

Most frequent character per category

Decimal Number:

Value  Count   Frequency (%)
2      782046  37.5%
0      782046  37.5%
1      260682  12.5%
9      260682  12.5%

Dash Punctuation:

Value  Count   Frequency (%)
-      521364  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2606820  100.0%

Most frequent character per script

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
-      521364  20.0%
1      260682  10.0%
9      260682  10.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2606820  100.0%

Most frequent character per block

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
-      521364  20.0%
1      260682  10.0%
9      260682  10.0%

PEILDATUM
Categorical

CONSTANT
HIGH CORRELATION
REJECTED

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
2021-02-01: 260682

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 2606820
Distinct characters: 4
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 2021-02-01
2nd row: 2021-02-01
3rd row: 2021-02-01
4th row: 2021-02-01
5th row: 2021-02-01

Common values

Value       Count   Frequency (%)
2021-02-01  260682  100.0%

[Figure: histogram of lengths of the category]

Most occurring characters

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
1      521364  20.0%
-      521364  20.0%

Most occurring categories

Value             Count    Frequency (%)
Decimal Number    2085456  80.0%
Dash Punctuation  521364   20.0%

Most frequent character per category

Decimal Number:

Value  Count   Frequency (%)
2      782046  37.5%
0      782046  37.5%
1      521364  25.0%

Dash Punctuation:

Value  Count   Frequency (%)
-      521364  100.0%

Most occurring scripts

Value   Count    Frequency (%)
Common  2606820  100.0%

Most frequent character per script

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
1      521364  20.0%
-      521364  20.0%

Most occurring blocks

Value  Count    Frequency (%)
ASCII  2606820  100.0%

Most frequent character per block

Value  Count   Frequency (%)
2      782046  30.0%
0      782046  30.0%
1      521364  20.0%
-      521364  20.0%

JAAR
Date

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
Minimum: 2012-01-01 00:00:00
Maximum: 2021-01-01 00:00:00
Histogram with fixed size bins (bins=10)

BEHANDELEND_SPECIALISME_CD
Real number (ℝ≥0)

Distinct: 27
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 421.425108
Minimum: 301
Maximum: 8418
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 301
5-th percentile: 302
Q1: 305
Median: 313
Q3: 322
95-th percentile: 335
Maximum: 8418
Range: 8117
Interquartile range (IQR): 17

Descriptive statistics

Standard deviation: 919.5893703
Coefficient of variation (CV): 2.182094405
Kurtosis: 71.47442542
Mean: 421.425108
Median Absolute Deviation (MAD): 8
Skewness: 8.564862963
Sum: 109857940
Variance: 845644.6101
Monotonicity: Not monotonic
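
Several of these figures are tied together by simple identities: range = maximum - minimum, IQR = Q3 - Q1, CV = standard deviation / mean, and variance = standard deviation squared. The sketch below checks them against the values reported for BEHANDELEND_SPECIALISME_CD; the variable names are illustrative only.

```python
# Values copied from the BEHANDELEND_SPECIALISME_CD statistics.
minimum, maximum = 301, 8418
q1, q3 = 305, 322
mean = 421.425108
std = 919.5893703

value_range = maximum - minimum  # spread between the extremes
iqr = q3 - q1                    # spread of the middle 50% of rows
cv = std / mean                  # dispersion relative to the mean
variance = std ** 2              # squared standard deviation

print(value_range, iqr, round(cv, 6), round(variance, 1))
```

The large gap between the range and the IQR reflects the few codes far above the 301 to 335 band that holds most rows.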
Histogram with fixed size bins (bins=27)

Common values

Value              Count  Frequency (%)
305                37016  14.2%
313                33709  12.9%
303                29972  11.5%
330                20832  8.0%
316                17819  6.8%
308                13506  5.2%
306                10895  4.2%
324                10872  4.2%
301                10517  4.0%
304                8482   3.3%
Other values (17)  67062  25.7%

Minimum 5 values

Value  Count  Frequency (%)
301    10517  4.0%
302    5659   2.2%
303    29972  11.5%
304    8482   3.3%
305    37016  14.2%

Maximum 5 values

Value  Count  Frequency (%)
8418   3395   1.3%
1900   171    0.1%
390    672    0.3%
389    2810   1.1%
362    3925   1.5%

TYPERENDE_DIAGNOSE_CD
Categorical

HIGH CARDINALITY

Distinct: 1766
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 2.0 MiB
101: 1090
402: 1064
403: 1037
301: 1030
203: 976
Other values (1761): 255485

Length

Max length: 4
Median length: 3
Mean length: 3.350776041
Min length: 2

Characters and Unicode

Total characters: 873487
Distinct characters: 25
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3
Unique (%): < 0.1%

Sample

1st row: 559
2nd row: 557
3rd row: 554
4th row: 559
5th row: 554

Common values

Value                Count   Frequency (%)
101                  1090    0.4%
402                  1064    0.4%
403                  1037    0.4%
301                  1030    0.4%
203                  976     0.4%
201                  971     0.4%
401                  873     0.3%
404                  862     0.3%
802                  852     0.3%
409                  844     0.3%
Other values (1756)  251083  96.3%

[Figure: histogram of lengths of the category]

Most occurring characters

Value              Count   Frequency (%)
1                  167361  19.2%
0                  159671  18.3%
2                  115731  13.2%
3                  94856   10.9%
5                  66917   7.7%
9                  63228   7.2%
4                  62258   7.1%
7                  51430   5.9%
6                  45768   5.2%
8                  37475   4.3%
Other values (15)  8792    1.0%

Most occurring categories

Value             Count   Frequency (%)
Decimal Number    864695  99.0%
Uppercase Letter  8792    1.0%

Most frequent character per category

Uppercase Letter:

Value             Count  Frequency (%)
G                 1635   18.6%
M                 1467   16.7%
B                 1051   12.0%
E                 758    8.6%
Z                 694    7.9%
D                 599    6.8%
A                 578    6.6%
F                 563    6.4%
C                 291    3.3%
K                 282    3.2%
Other values (5)  874    9.9%

Decimal Number:

Value  Count   Frequency (%)
1      167361  19.4%
0      159671  18.5%
2      115731  13.4%
3      94856   11.0%
5      66917   7.7%
9      63228   7.3%
4      62258   7.2%
7      51430   5.9%
6      45768   5.3%
8      37475   4.3%

Most occurring scripts

Value   Count   Frequency (%)
Common  864695  99.0%
Latin   8792    1.0%

Most frequent character per script

Latin:

Value             Count  Frequency (%)
G                 1635   18.6%
M                 1467   16.7%
B                 1051   12.0%
E                 758    8.6%
Z                 694    7.9%
D                 599    6.8%
A                 578    6.6%
F                 563    6.4%
C                 291    3.3%
K                 282    3.2%
Other values (5)  874    9.9%

Common:

Value  Count   Frequency (%)
1      167361  19.4%
0      159671  18.5%
2      115731  13.4%
3      94856   11.0%
5      66917   7.7%
9      63228   7.3%
4      62258   7.2%
7      51430   5.9%
6      45768   5.3%
8      37475   4.3%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  873487  100.0%

Most frequent character per block

Value              Count   Frequency (%)
1                  167361  19.2%
0                  159671  18.3%
2                  115731  13.2%
3                  94856   10.9%
5                  66917   7.7%
9                  63228   7.2%
4                  62258   7.1%
7                  51430   5.9%
6                  45768   5.2%
8                  37475   4.3%
Other values (15)  8792    1.0%

ZORGPRODUCT_CD
Real number (ℝ≥0)

Distinct: 5893
Distinct (%): 2.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 441790049.6
Minimum: 10501002
Maximum: 998418081
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 10501002
5-th percentile: 28999037
Q1: 99799061
Median: 149599030
Q3: 990004004
95-th percentile: 990416051
Maximum: 998418081
Range: 987917079
Interquartile range (IQR): 890204943

Descriptive statistics

Standard deviation: 429303066.6
Coefficient of variation (CV): 0.9717354817
Kurtosis: -1.742027437
Mean: 441790049.6
Median Absolute Deviation (MAD): 119700019
Skewness: 0.4627499483
Sum: 1.151667137 × 10^14
Variance: 1.84301123 × 10^17
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
990004009            1898    0.7%
990004007            1869    0.7%
990003004            1868    0.7%
990004006            1510    0.6%
990356076            1340    0.5%
990356073            1242    0.5%
990003007            1197    0.5%
131999228            1139    0.4%
131999164            1126    0.4%
199299013            1085    0.4%
Other values (5883)  246408  94.5%

Minimum 5 values

Value     Count  Frequency (%)
10501002  6      < 0.1%
10501003  9      < 0.1%
10501004  9      < 0.1%
10501005  9      < 0.1%
10501007  3      < 0.1%

Maximum 5 values

Value      Count  Frequency (%)
998418081  127    < 0.1%
998418080  113    < 0.1%
998418079  33     < 0.1%
998418077  6      < 0.1%
998418076  6      < 0.1%

AANTAL_PAT_PER_ZPD
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 9007
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 508.3764318
Minimum: 1
Maximum: 154837
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 14
Q3: 103
95-th percentile: 1727
Maximum: 154837
Range: 154836
Interquartile range (IQR): 100

Descriptive statistics

Standard deviation: 3126.057126
Coefficient of variation (CV): 6.14909923
Kurtosis: 389.7948348
Mean: 508.3764318
Median Absolute Deviation (MAD): 13
Skewness: 16.44053249
Sum: 132524585
Variance: 9772233.152
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    42850   16.4%
2                    21110   8.1%
3                    13723   5.3%
4                    10169   3.9%
5                    7889    3.0%
6                    6628    2.5%
7                    5555    2.1%
8                    4714    1.8%
9                    4293    1.6%
10                   3785    1.5%
Other values (8997)  139966  53.7%

Minimum 5 values

Value  Count  Frequency (%)
1      42850  16.4%
2      21110  8.1%
3      13723  5.3%
4      10169  3.9%
5      7889   3.0%

Maximum 5 values

Value   Count  Frequency (%)
154837  1      < 0.1%
154823  1      < 0.1%
153884  1      < 0.1%
144702  1      < 0.1%
114354  1      < 0.1%

AANTAL_SUBTRAJECT_PER_ZPD
Real number (ℝ≥0)

HIGH CORRELATION
SKEWED

Distinct: 9624
Distinct (%): 3.7%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 594.1707483
Minimum: 1
Maximum: 239907
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 3
Median: 15
Q3: 112
95-th percentile: 1957
Maximum: 239907
Range: 239906
Interquartile range (IQR): 109

Descriptive statistics

Standard deviation: 3958.105553
Coefficient of variation (CV): 6.661562463
Kurtosis: 707.3255652
Mean: 594.1707483
Median Absolute Deviation (MAD): 14
Skewness: 21.01127808
Sum: 154889619
Variance: 15666599.57
Monotonicity: Not monotonic
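
The skewness figure above (γ1 = 21.01127808, which triggers the Skewed warning) is a third-moment statistic. The sketch below shows the plain moment estimator on a small made-up sample. The report's value may come from a bias-adjusted variant, so this illustrates the idea rather than reproducing the exact computation; `skewness` is a hypothetical helper name.

```python
def skewness(xs):
    """Moment estimator g1 = m3 / m2**1.5 (assumes xs is not constant)."""
    n = len(xs)
    mean = sum(xs) / n
    m2 = sum((x - mean) ** 2 for x in xs) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in xs) / n  # third central moment
    return m3 / m2 ** 1.5


# Made-up sample with one large value, mimicking a long right tail.
right_tailed = [1, 1, 1, 1, 10]
symmetric = [1, 2, 3, 4, 5]

print(skewness(right_tailed))  # positive: the tail is on the right
print(skewness(symmetric))     # zero for a symmetric sample
```

A median of 15 against a maximum of 239907 is the same pattern at scale: most rows are small and a handful of very large counts pull the third moment up.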
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
1                    41308   15.8%
2                    20771   8.0%
3                    13580   5.2%
4                    9990    3.8%
5                    7798    3.0%
6                    6631    2.5%
7                    5509    2.1%
8                    4661    1.8%
9                    4216    1.6%
10                   3839    1.5%
Other values (9614)  142379  54.6%

Minimum 5 values

Value  Count  Frequency (%)
1      41308  15.8%
2      20771  8.0%
3      13580  5.2%
4      9990   3.8%
5      7798   3.0%

Maximum 5 values

Value   Count  Frequency (%)
239907  1      < 0.1%
232484  1      < 0.1%
231318  1      < 0.1%
227657  1      < 0.1%
221390  1      < 0.1%

AANTAL_PAT_PER_DIAG
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 7868
Distinct (%): 3.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7669.618067
Minimum: 1
Maximum: 214611
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 44
Q1: 412
Median: 1725
Q3: 6409
95-th percentile: 36788
Maximum: 214611
Range: 214610
Interquartile range (IQR): 5997

Descriptive statistics

Standard deviation: 17662.42092
Coefficient of variation (CV): 2.302907493
Kurtosis: 32.77559046
Mean: 7669.618067
Median Absolute Deviation (MAD): 1564
Skewness: 4.986479708
Sum: 1999331377
Variance: 311961112.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
21                   450     0.2%
12                   374     0.1%
17                   372     0.1%
25                   370     0.1%
37                   360     0.1%
23                   360     0.1%
19                   355     0.1%
26                   354     0.1%
6                    345     0.1%
15                   337     0.1%
Other values (7858)  257005  98.6%

Minimum 5 values

Value  Count  Frequency (%)
1      265    0.1%
2      281    0.1%
3      291    0.1%
4      321    0.1%
5      276    0.1%

Maximum 5 values

Value   Count  Frequency (%)
214611  23     < 0.1%
212136  25     < 0.1%
209820  19     < 0.1%
208368  17     < 0.1%
204239  17     < 0.1%

AANTAL_SUBTRAJECT_PER_DIAG
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 8704
Distinct (%): 3.3%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 10818.26704
Minimum: 1
Maximum: 345671
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 1
5-th percentile: 55
Q1: 544
Median: 2374
Q3: 8944
95-th percentile: 51444
Maximum: 345671
Range: 345670
Interquartile range (IQR): 8400

Descriptive statistics

Standard deviation: 25732.01422
Coefficient of variation (CV): 2.378570811
Kurtosis: 37.08158505
Mean: 10818.26704
Median Absolute Deviation (MAD): 2172
Skewness: 5.266797295
Sum: 2820127489
Variance: 662136555.6
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
38                   343     0.1%
24                   335     0.1%
13                   306     0.1%
25                   292     0.1%
17                   291     0.1%
18                   291     0.1%
10                   289     0.1%
48                   279     0.1%
20                   276     0.1%
93                   270     0.1%
Other values (8694)  257710  98.9%

Minimum 5 values

Value  Count  Frequency (%)
1      215    0.1%
2      219    0.1%
3      247    0.1%
4      257    0.1%
5      236    0.1%

Maximum 5 values

Value   Count  Frequency (%)
345671  25     < 0.1%
343141  23     < 0.1%
340525  19     < 0.1%
323714  20     < 0.1%
305771  17     < 0.1%

AANTAL_PAT_PER_SPC
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 243
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 670993.5003
Minimum: 38
Maximum: 1489512
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 38
5-th percentile: 44662
Q1: 286650
Median: 744724
Q3: 995517
95-th percentile: 1345280
Maximum: 1489512
Range: 1489474
Interquartile range (IQR): 708867

Descriptive statistics

Standard deviation: 413081.1368
Coefficient of variation (CV): 0.6156261374
Kurtosis: -1.07409858
Mean: 670993.5003
Median Absolute Deviation (MAD): 313635
Skewness: 0.0473962123
Sum: 1.749159277 × 10^11
Variance: 1.706360256 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
880967              5102    2.0%
874259              4354    1.7%
843996              4348    1.7%
893043              4332    1.7%
874956              4271    1.6%
833887              4138    1.6%
1084230             3891    1.5%
1063771             3851    1.5%
1077489             3847    1.5%
1038995             3810    1.5%
Other values (233)  218738  83.9%

Minimum 5 values

Value  Count  Frequency (%)
38     1      < 0.1%
975    81     < 0.1%
1426   125    < 0.1%
1953   131    0.1%
2584   173    0.1%

Maximum 5 values

Value    Count  Frequency (%)
1489512  2976   1.1%
1450634  3054   1.2%
1421871  3564   1.4%
1345280  3543   1.4%
1332919  3546   1.4%

AANTAL_SUBTRAJECT_PER_SPC
Real number (ℝ≥0)

HIGH CORRELATION

Distinct: 243
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1061710.004
Minimum: 73
Maximum: 2582832
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 73
5-th percentile: 52194
Q1: 483178
Median: 1014348
Q3: 1729123
95-th percentile: 2455216
Maximum: 2582832
Range: 2582759
Interquartile range (IQR): 1245945

Descriptive statistics

Standard deviation: 715343.2068
Coefficient of variation (CV): 0.6737651563
Kurtosis: -0.8721171883
Mean: 1061710.004
Median Absolute Deviation (MAD): 643653
Skewness: 0.3279760176
Sum: 2.767686873 × 10^11
Variance: 5.117159035 × 10^11
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value               Count   Frequency (%)
1211812             5102    2.0%
1281547             4354    1.7%
1216286             4348    1.7%
1313000             4332    1.7%
1290095             4271    1.6%
1227553             4138    1.6%
2557237             3891    1.5%
2489097             3851    1.5%
2582832             3847    1.5%
2066422             3810    1.5%
Other values (233)  218738  83.9%

Minimum 5 values

Value  Count  Frequency (%)
73     1      < 0.1%
1069   81     < 0.1%
1645   125    < 0.1%
2237   131    0.1%
2924   173    0.1%

Maximum 5 values

Value    Count  Frequency (%)
2582832  3847   1.5%
2557237  3891   1.5%
2489097  3851   1.5%
2455216  3810   1.5%
2184710  3757   1.4%

GEMIDDELDE_VERKOOPPRIJS
Real number (ℝ≥0)

MISSING

Distinct: 3176
Distinct (%): 1.4%
Missing: 39793
Missing (%): 15.3%
Infinite: 0
Infinite (%): 0.0%
Mean: 3510.165196
Minimum: 60
Maximum: 287220
Zeros: 0
Zeros (%): 0.0%
Memory size: 2.0 MiB

Quantile statistics

Minimum: 60
5-th percentile: 140
Q1: 465
Median: 1235
Q3: 4040
95-th percentile: 13275
Maximum: 287220
Range: 287160
Interquartile range (IQR): 3575

Descriptive statistics

Standard deviation: 6590.50629
Coefficient of variation (CV): 1.877548754
Kurtosis: 169.2048005
Mean: 3510.165196
Median Absolute Deviation (MAD): 1005
Skewness: 7.857079711
Sum: 775356880
Variance: 43434773.15
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)

Common values

Value                Count   Frequency (%)
105                  1850    0.7%
160                  1810    0.7%
110                  1465    0.6%
180                  1380    0.5%
500                  1325    0.5%
120                  1317    0.5%
185                  1293    0.5%
145                  1262    0.5%
300                  1250    0.5%
140                  1212    0.5%
Other values (3166)  206725  79.3%
(Missing)            39793   15.3%

Minimum 5 values

Value  Count  Frequency (%)
60     2      < 0.1%
70     226    0.1%
75     75     < 0.1%
80     361    0.1%
85     908    0.3%

Maximum 5 values

Value   Count  Frequency (%)
287220  8      < 0.1%
148910  3      < 0.1%
142855  4      < 0.1%
122155  4      < 0.1%
116765  3      < 0.1%

Interactions

[Figures: pairwise interaction plots of the numeric variables]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, so for a linear relationship the steepness of the slope does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
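
A minimal sketch of that calculation in plain Python follows. The function name `pearson_r` and the sample data are illustrative; the 1/n factors of the covariance and the standard deviations cancel, so they are omitted.

```python
import math


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of centred products; the common 1/n factor cancels in the ratio.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


xs = [1, 2, 4, 7]
ys = [3, 1, 6, 5]
print(pearson_r(xs, ys))
# Shifting and positively rescaling either variable leaves r unchanged.
print(pearson_r([10 * x + 3 for x in xs], [0.5 * y - 2 for y in ys]))
```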

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
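
The sketch below follows that recipe: it replaces each value by its rank and then applies the covariance-over-standard-deviations formula to the ranks. Giving tied values their average rank is a common convention and is assumed here; the helper names are illustrative.

```python
import math


def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def correlation(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def spearman_rho(xs, ys):
    return correlation(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]
print(spearman_rho(xs, cubes))  # monotonic relation: rho is 1
print(correlation(xs, cubes))   # the same data gives r below 1
```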

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
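
Taken literally, this definition is the variant usually called tau-a, sketched below. Pairs tied in either variable count as neither concordant nor discordant here; libraries often apply a tie adjustment, so their results can differ on data with ties. The function name is illustrative.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move the same way
        elif direction < 0:
            discordant += 1   # the variables move in opposite ways
    n = len(xs)
    total_pairs = n * (n - 1) / 2
    return (concordant - discordant) / total_pairs


# Three pairs: two concordant, one discordant.
print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```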

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available for the phik package.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. The report uses the bias-corrected measure proposed by Bergsma in 2013.
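
A sketch of the bias-corrected estimator, computed from a contingency table of counts, is given below. It follows the commonly quoted form of Bergsma's correction and is not taken from the report's source code; it assumes every row and column total is non-zero and that the table has at least two rows and two columns. The function name is illustrative.

```python
import math


def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for an r x k table of counts."""
    r, k = len(table), len(table[0])
    n = sum(sum(row) for row in table)
    row_totals = [sum(row) for row in table]
    col_totals = [sum(table[i][j] for i in range(r)) for j in range(k)]

    # Pearson chi-squared statistic against the independence model.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Correction terms for phi^2 and for the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    return math.sqrt(phi2_corr / min(k_corr - 1, r_corr - 1))


print(cramers_v_corrected([[10, 10], [10, 10]]))  # independent counts
print(cramers_v_corrected([[50, 0], [0, 50]]))    # perfect association
```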

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
The dendrogram lets you correlate variable completion more fully, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
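
The per-column nullity count behind the first chart can be sketched as follows. The rows are made-up dictionaries rather than the real dataset, and both `None` and NaN are treated as missing, which is an assumption about how the data are encoded.

```python
def nullity_by_column(rows, columns):
    """Count missing values (None or NaN) per column."""
    counts = {column: 0 for column in columns}
    for row in rows:
        for column in columns:
            value = row.get(column)
            # NaN is the only float that is not equal to itself.
            if value is None or (isinstance(value, float) and value != value):
                counts[column] += 1
    return counts


# Made-up rows using two column names from this dataset.
rows = [
    {"ZORGPRODUCT_CD": 70401002, "GEMIDDELDE_VERKOOPPRIJS": 1425.0},
    {"ZORGPRODUCT_CD": 70401004, "GEMIDDELDE_VERKOOPPRIJS": float("nan")},
    {"ZORGPRODUCT_CD": 70401006, "GEMIDDELDE_VERKOOPPRIJS": None},
]
missing = nullity_by_column(rows, ["ZORGPRODUCT_CD", "GEMIDDELDE_VERKOOPPRIJS"])
print(missing)
```

In this report only GEMIDDELDE_VERKOOPPRIJS has missing values (39793 rows), so it would be the only column with a non-zero count.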

Sample

First rows

,VERSIE,DATUM_BESTAND,PEILDATUM,JAAR,BEHANDELEND_SPECIALISME_CD,TYPERENDE_DIAGNOSE_CD,ZORGPRODUCT_CD,AANTAL_PAT_PER_ZPD,AANTAL_SUBTRAJECT_PER_ZPD,AANTAL_PAT_PER_DIAG,AANTAL_SUBTRAJECT_PER_DIAG,AANTAL_PAT_PER_SPC,AANTAL_SUBTRAJECT_PER_SPC,GEMIDDELDE_VERKOOPPRIJS
0,1.0,2021-02-09,2021-02-01,2016-01-01,301,559,70401002,58,60,4445,5485,1191108,1910104,1425.0
1,1.0,2021-02-09,2021-02-01,2016-01-01,301,557,70401002,15,15,44290,56006,1191108,1910104,1425.0
2,1.0,2021-02-09,2021-02-01,2016-01-01,301,554,70401002,12596,14295,200180,288419,1191108,1910104,1425.0
3,1.0,2021-02-09,2021-02-01,2016-01-01,301,559,70401003,6,6,4445,5485,1191108,1910104,625.0
4,1.0,2021-02-09,2021-02-01,2016-01-01,301,554,70401003,97,98,200180,288419,1191108,1910104,625.0
5,1.0,2021-02-09,2021-02-01,2016-01-01,301,557,70401003,10,10,44290,56006,1191108,1910104,625.0
6,1.0,2021-02-09,2021-02-01,2016-01-01,301,557,70401004,4,4,44290,56006,1191108,1910104,NaN
7,1.0,2021-02-09,2021-02-01,2016-01-01,301,559,70401004,10,10,4445,5485,1191108,1910104,NaN
8,1.0,2021-02-09,2021-02-01,2016-01-01,301,554,70401004,69,70,200180,288419,1191108,1910104,NaN
9,1.0,2021-02-09,2021-02-01,2016-01-01,301,554,70401006,2,2,200180,288419,1191108,1910104,NaN

Last rows

,VERSIE,DATUM_BESTAND,PEILDATUM,JAAR,BEHANDELEND_SPECIALISME_CD,TYPERENDE_DIAGNOSE_CD,ZORGPRODUCT_CD,AANTAL_PAT_PER_ZPD,AANTAL_SUBTRAJECT_PER_ZPD,AANTAL_PAT_PER_DIAG,AANTAL_SUBTRAJECT_PER_DIAG,AANTAL_PAT_PER_SPC,AANTAL_SUBTRAJECT_PER_SPC,GEMIDDELDE_VERKOOPPRIJS
260672,1.0,2021-02-09,2021-02-01,2016-01-01,8418,304,998418081,1,1,523,538,42595,45568,NaN
260673,1.0,2021-02-09,2021-02-01,2016-01-01,8418,305,998418081,2,2,405,421,42595,45568,NaN
260674,1.0,2021-02-09,2021-02-01,2016-01-01,8418,202,998418081,3,3,700,734,42595,45568,NaN
260675,1.0,2021-02-09,2021-02-01,2016-01-01,8418,503,998418081,11,11,3533,3710,42595,45568,NaN
260676,1.0,2021-02-09,2021-02-01,2016-01-01,8418,303,998418081,22,22,6223,6389,42595,45568,NaN
260677,1.0,2021-02-09,2021-02-01,2016-01-01,8418,309,998418081,8,8,2236,2303,42595,45568,NaN
260678,1.0,2021-02-09,2021-02-01,2016-01-01,8418,203,998418081,2,2,1800,1849,42595,45568,NaN
260679,1.0,2021-02-09,2021-02-01,2016-01-01,8418,504,998418081,17,17,4554,4740,42595,45568,NaN
260680,1.0,2021-02-09,2021-02-01,2016-01-01,8418,402,998418081,10,10,742,872,42595,45568,NaN
260681,1.0,2021-02-09,2021-02-01,2016-01-01,8418,209,998418081,2,2,391,407,42595,45568,NaN